I recently troubleshot an interesting MySQL crash, and I think it is worth sharing (along with the related bugs). MySQL crashed when the disk was full, as you can see in the free disk graph below. The Y-axis is in the tens of GiB scale and the X-axis is in the hour scale. Can you guess what happened ?
Just to make sure we agree on the meaning of the graph above, let's describe it :
- we start with tens of GiB of free disk space;
- then slowly, over more than 60 minutes, more than half of the free disk space is consumed;
- the disk is then quickly filled (not explicitly shown in the graph, but this is where MySQL crashed);
- finally, tens of GiB are released (at the moment of the crash / just after the crash).
If you are not sure of what happened yet, let's add something not necessarily obvious from the graph : the amount of disk space freed after the crash matches what was slowly consumed (before the disk quickly filled).
Still unsure ? Check my previous post : Interesting Binary Logging Optimization in <a MySQL fork>.
(Sorry for being a little convoluted in hiding the name of the fork above : my previous post did not show up on Planet MySQL, and I am hoping this one will be aggregated, so I need to minimize references to that fork. If you are looking for an alternative to Planet MySQL to get community news, check out Planet for the MySQL Community.)
And because the answer is (almost) always in the logs, here is what they contain :
2025-xx-yyTzz:12:18.469308Z 1916497 [ERROR] [MY-011072] [Server] Binary logging not possible. Message: An error occurred during flush stage of the commit. 'binlog_error_action' is set to 'ABORT_SERVER'. Server is being stopped.
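For reference, the abort behaviour mentioned in this message is controlled by the binlog_error_action variable. Below is a minimal way to check it, using the sandbox client of annex #1 (adapt the client invocation to your environment).

# Check how the server reacts to binary logging errors.
# ABORT_SERVER is the default since MySQL 5.7.7; the alternative, IGNORE_ERROR,
# keeps the server up but silently disables binary logging (breaking replication).
./use <<< "SHOW GLOBAL VARIABLES LIKE 'binlog_error_action'"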
You probably figured it out by now : a big transaction generated a lot of binary logs, which filled the disk and caused MySQL to crash. But also, in MySQL (not the MySQL fork with the recent optimization), ...
Binary Logging needs Double the Disk Space !
And in the graph above, the rapid release of disk space matches the deletion of the temporary file used for spilling binary logs to disk.
If you do not understand the above, read my previous post, where I explain how binary logging happens for big transactions (tmp file and double disk space). I also find it interesting that, with the optimization in the MySQL fork presented in that previous post, the crash of this post would not have happened.
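If you want to check whether transactions on your servers spill their binary log events to such a temporary file, the server keeps counters for this. A quick look, using the sandbox client of annex #1 :

# Binlog_cache_disk_use counts transactions whose binary log events exceeded
# binlog_cache_size and were spilled to a temporary file;
# Binlog_cache_use counts all transactions that used the binary log cache.
./use <<< "SHOW GLOBAL STATUS LIKE 'Binlog_cache%'"
./use <<< "SHOW GLOBAL VARIABLES LIKE 'binlog_cache_size'"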
A new situation where COMMIT can fail !
I think MySQL should not start committing such a transaction, because it should know it is painting itself into a corner by filling the disk. For this, I opened Bug #118356 : MySQL attempts committing transaction not fitting on disk, then crashes. Note that fixing this bug would introduce a situation where COMMIT fails, which is almost unheard of in MySQL (the only other situation I know of where COMMIT can fail in MySQL is on conflict detection in Group Replication).
If you want to reproduce such a crash, the steps are in the above bug and in annex #1, with the annex containing extra commands to observe the size of the binary log temporary file and the amount of free disk space.
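Until something like Bug #118356 is implemented, a DBA-side guard can approximate the missing check. Below is only a rough sketch : it reuses the lsof trick of annex #1, it assumes the binary logs and the temporary spill file are on the same filesystem as the current directory, and the 10-second interval is an arbitrary choice.

# Hypothetical guard : warn when the binlog temporary file of a running
# transaction is already larger than the remaining free disk space, i.e. when
# the copy to the binary logs at COMMIT time cannot fit.
while sleep 10; do
  tmp_mb="$(lsof | sed -n -e /mysqld.*deleted/p |
    awk 'NF == 10 && $7 != 0{print int($7 /1024 /1024)}' | sort -n | tail -n 1)"
  free_mb="$(df -m . | tail -n 1 | awk '{print $4}')"
  if test -n "$tmp_mb" && test "$tmp_mb" -gt "$free_mb"; then
    echo "$(date): WARNING: tmp binlog file ${tmp_mb} MiB > ${free_mb} MiB free"
  fi
done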
But there is more to write about this. After the crash, MySQL does not restart "like a charm". The uncommitted transaction needs to be rolled back, which takes time. Also, the partial transaction needs to be cleaned from the binary log. We can see the rollback delay and the binary log cleanup in the extended free disk graph below...
...and the rollback delay in the matching error logs extract below.
2025-xx-yyTzz:12:18.469308Z 1916497 [ERROR] [MY-011072] [Server] Binary logging not possible. Message: An error occurred during flush stage of the commit. 'binlog_error_action' is set to 'ABORT_SERVER'. Server is being stopped.
[...]
2025-xx-yyTzz:12:20.297403Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2025-xx-yyTzz:14:14.898222Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2025-xx-yyTzz:16:02.120982Z 0 [System] [MY-010229] [Server] Starting XA crash recovery...
2025-xx-yyTzz:42:50.287434Z 0 [System] [MY-010232] [Server] XA crash recovery finished.
[...]
We can see large gaps in the logs above, and for this, I opened Bug #118358 : Startup without Signs of Progress when Rolling-Back Big Transaction.
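In the meantime, the only way I know to follow such a restart from the outside is to watch the error log and the disk. A simple sketch, assuming a dbdeployer sandbox like in annex #1 (the error log path will differ in other setups) :

# Follow the recovery phases (InnoDB initialization, XA crash recovery, ...)
# in the sandbox error log...
tail -F data/msandbox.err
# ...and, in another terminal, watch free disk space : it only jumps back up
# once the partial transaction is removed from the binary logs.
while sleep 10; do echo "$(date): $(df -m . | tail -n 1 | awk '{print $4}')"; done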
A MySQL fork improved its startup in such a situation in a creative way (this feature is a contribution by Libing Song of Alibaba Cloud). Instead of waiting for the rollback to complete before allowing connections, the rollback is done in the background while the server is considered up ! They blogged about this optimization (Rollback Prepared Transactions Asynchronously During Binlog Crash Recovery) and I shared my thoughts on LinkedIn (direct link). This optimization would be interesting in MySQL, but it is not of the highest priority for me, and I would prefer other things to be implemented first (including Bug #118332 from my previous post), so I did not open a feature request.
I also think MySQL could avoid crashing in the situation described in this post. Because it is able to recover after the crash, it could "just recover without crashing" by fixing the binary logs and rolling back the transaction. However, this is not a priority for me either, so I did not open a feature request. I think avoiding this situation by not starting the commit (Bug #118356), or removing the need for extra disk space on commit (Bug #118332), are better investments.
I am closing with the same quick comment as in the previous post: it is usually a MySQL best practice to avoid big transactions. My suggesting optimizations to the binary logging of big transactions does not mean I endorse their usage. I still think big transactions should be avoided (transactions should be smaller than 10 MiB). Independently of that best practice, MySQL should make big transactions quicker and leaner to commit.
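To illustrate that best practice with the test table of annex #1, the big UPDATE can be split into many small transactions by iterating on the primary key. This is only a sketch : the 10,000-row chunk size is an arbitrary choice, it assumes the ids are dense from 1 to the maximum, and it assumes the sandbox use script passes its arguments to the mysql client (-N skips column names).

# Update the table in chunks of 10,000 rows : each iteration is its own
# transaction, so each one only generates a small amount of binary logs.
chunk=10000
max_id="$(./use -N test_jfg <<< 'SELECT MAX(id) FROM t')"
for ((i = 1; i <= max_id; i += chunk)); do
  ./use test_jfg <<< "UPDATE t SET a = a+1 WHERE id BETWEEN $i AND $((i + chunk - 1))"
done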
Annex #1 : Crash during Binary Logs Copy Because not Enough Free Disk Space
The timings below are from an m6id.large AWS VM (2 vCPU, 8 GiB RAM, and 118 GB local SSD). There is no timing output for the UPDATE, as it will block and then fail, but it is shown in my previous post that it would normally take 6:11 to succeed.
# Create a sandbox for our tests.
dbdeployer deploy single mysql_8.4.5

# Creating a table, inserting 1 Mi rows, and growing each row to about 4 KiB.
# (the pv commands are a trick to time command execution)
{
nb_rows=$((1024*1024))
./use <<< "
  CREATE DATABASE test_jfg;
  CREATE TABLE test_jfg.t (id INT AUTO_INCREMENT PRIMARY KEY, a INT DEFAULT 0)"
seq 1 $nb_rows |
  awk '{print "(null)"}' | tr " " "," |
  paste -s -d "$(printf ',%.0s' {1..100})\n" |
  sed -e 's/.*/INSERT INTO t(id) values &;/' |
  ./use test_jfg | pv -tN insert
y200="$(yes | head -n 200 | paste -s -d "")"
y240="$(yes | head -n 240 | paste -s -d "")"
{
  echo "ALTER TABLE t ADD COLUMN c0 CHAR(200) DEFAULT '$y200'"
  seq -f " ADD COLUMN c%.0f CHAR(240) DEFAULT '$y240'" 1 15
} | paste -s -d "," | ./use test_jfg
./use test_jfg <<< "ALTER TABLE t FORCE" | pv -tN alter
./use test_jfg <<< "FLUSH TABLE t FOR EXPORT" | pv -tN flush
}
insert: 0:00:08
alter:  0:04:00
flush:  0:00:00

# Here, make sure there is MORE than 8 GiB and LESS than 15 GiB of free disk space.
# (in my previous post, we saw below UPDATE generates 7.6 GiB of binlogs).

# Update all rows of the table, filling the disk.
./use test_jfg <<< "UPDATE t SET a = a+1" | pv -tN update

# While running the UPDATE above, below can be used to see the size of the tmp binlog file.
while sleep 1; do
  lsof="$(lsof | sed -n -e 1p -e /mysqld.*deleted/p |
    awk 'NF == 10 && $7 != 0{print $7 /1024 /1024}')"
  echo "$(date): $lsof"
done

# While running the UPDATE above, below can be used to see the amount of free disk space.
while sleep 1; do
  echo "$(date): $(df -m . | tail -n 1 | awk '{print $4}')"
done

# The error log contains below when the disk is full.
2025-06-02T18:16:31.247369Z 14 [ERROR] [MY-000035] [Server] Disk is full writing './binlog.000001' (OS errno 28 - No space left on device). Waiting for someone to free space... Retry in 60 secs. Message reprinted in 600 secs.

# When the disk is full, kill the UPDATE query to crash MySQL.
./use <<< "SHOW PROCESSLIST" | awk '$0 ~ "UPDATE"{print "kill " $1 ";"}' | ./use

# The error log contains below for the crash.
2025-06-02T18:17:02.249380Z 14 [ERROR] [MY-010907] [Server] Error writing file 'binlog' (errno: 28 - No space left on device)
2025-06-02T18:17:02.309586Z 14 [ERROR] [MY-011072] [Server] Binary logging not possible. Message: An error occurred during flush stage of the commit. 'binlog_error_action' is set to 'ABORT_SERVER'. Server is being stopped..
2025-06-02T18:17:02Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
[...]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f6d141eca70): UPDATE t SET a = a+1
Connection ID (thread ID): 14
Status: KILL_CONNECTION